Universal behavior of localization of residue fluctuations in globular proteins 
Localization properties of residue fluctuations in globular proteins are studied theoretically
using the Gaussian network model. The participation ratio of each residue fluctuation mode is
calculated. It is found that the relationship between participation ratio and frequency is similar
for all globular proteins, indicating a universal behavior in spite of their different sizes,
shapes, and architectures.



PACS numbers: 87.15.Ya, 87.15.He, 87.14.Ee 

Proteins are important biological macromolecules that
control almost all functions of living organisms. It was
once believed that proteins are rather amorphous and
without well-defined structures. As more and more
structures have been determined by crystallographic and
NMR methods, it has become clear that protein structures
are far from random. They have well-defined secondary and
tertiary structures, which carry essential information
about their functions and mechanisms.

Proteins in their folded states are not static. Instead, the
constituent residues fluctuate about their native positions
owing to finite-temperature effects [1]. It is now well
accepted that these fluctuations are crucial for enzyme
catalysis and for biological activity [2,3]. Recently,
there has been considerable interest in the correlations
between protein functions and fluctuations [2]. Intensive
theoretical studies of protein fluctuations have been
carried out based on either molecular dynamics simulations
or normal mode analyses (NMA) using all-atom empirical
potentials [4]. It has been shown that NMA is a very useful
method for studying protein fluctuations [5,6]. Atomic
approaches, however, become computationally demanding for
large proteins: proteins composed of more than a thousand
residues are difficult to investigate with conventional
atomic models and potentials. On the other hand,
coarse-grained protein models and simplified force fields
have been very successful in describing the residue
fluctuations of proteins [7-10]. Although residue
fluctuations have been studied intensively, to our
knowledge there have been few studies of their
localization properties.

In this paper, based on a coarse-grained protein model,
we show theoretically that the localization of residue
fluctuations behaves similarly in globular proteins, even
though their architectures and sizes are rather different.
In our study of residue fluctuations, proteins are modeled
as elastic networks. The nodes are residues linked by
inter-residue potentials that stabilize the folded
conformation. This model is usually referred to as the
Gaussian network model (GNM), and it gives a satisfactory
description of the fluctuations of folded proteins
[8,9,11-15]. In this model, residues are assumed to undergo
Gaussian-distributed fluctuations about their native
positions. No distinction is made between different types
of residues. A single generic harmonic force constant is
used for the inter-residue interaction potential within a
cutoff range. We consider residues as the minimal
representative units, and the α-carbons are used as the
corresponding sites for residues. Considering all
contacting residues, the internal Hamiltonian within the
GNM is given by [8,9]

$$H = \frac{1}{2}\gamma \left[ \Delta R^{T} (\Gamma \otimes E) \Delta R \right], \qquad (1)$$

where $\gamma$ is the harmonic force constant; $\{\Delta R\}$
represents the $3N$-dimensional column vector of fluctuations
$\Delta R_1, \ldots, \Delta R_N$ of the C$^\alpha$ atoms, where $N$
is the number of residues; $E$ is the third-order identity
matrix; the superscript $T$ denotes the transpose; $\otimes$
stands for the direct product; and $\Gamma$ is the $N \times N$
Kirchhoff matrix [16] with the elements given by

$$\Gamma_{ij} = \begin{cases} -H(r_c - r_{ij}), & i \neq j, \\ -\sum_{k \neq i} \Gamma_{ik}, & i = j. \end{cases} \qquad (2)$$

Here, $r_{ij}$ is the separation between the $i$-th and $j$-th
C$^\alpha$ atoms; $H(x)$ is the Heaviside step function; and $r_c$
is the cutoff distance outside of which there is no
inter-residue interaction. The $i$-th diagonal element of
$\Gamma$ characterizes the local packing density or the
coordination number of residue $i$. The inverse of the
Kirchhoff matrix can be decomposed as



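$$\Gamma^{-1} = U \Lambda^{-1} U^{T}, \qquad (3)$$

where $U$ is an orthogonal matrix whose columns
$u_i$ ($1 \le i \le N$) are the eigenvectors of $\Gamma$, and
$\Lambda$ is the diagonal matrix of the eigenvalues $\lambda_i$
of $\Gamma$. Cross-correlations of residue fluctuations between
the $i$-th and $j$-th residues are found from

$$\langle \Delta R_i \cdot \Delta R_j \rangle = \frac{3 k_B T}{\gamma} [\Gamma^{-1}]_{ij}, \qquad (4)$$

where $k_B$ is the Boltzmann constant and $T$ is the absolute
temperature. From Eqs. (3) and (4), the mean-square (ms)
fluctuations of the $i$-th residue (which determine the
Debye-Waller factors, or B-factors) associated with the
$\alpha$-th mode are given by

$$[\Delta R_i \cdot \Delta R_i]_{\alpha} = \frac{3 k_B T}{\gamma} \lambda_{\alpha}^{-1} [u_{\alpha}]_i [u_{\alpha}]_i. \qquad (5)$$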
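
The step from Eqs. (3) and (4) to the per-mode expression can be spelled out as follows. This is a sketch under two assumptions that are standard for the GNM but are not stated explicitly above: the prefactor of Eq. (4) is taken as $3 k_B T/\gamma$, and the sum over modes excludes the single zero eigenvalue of the Kirchhoff matrix of a connected network, so that $\Gamma^{-1}$ is understood as a pseudo-inverse.

```latex
[\Gamma^{-1}]_{ij}
  = \sum_{\alpha:\,\lambda_\alpha \neq 0} \lambda_\alpha^{-1}\,[u_\alpha]_i\,[u_\alpha]_j ,
\qquad
\langle \Delta R_i \cdot \Delta R_i \rangle
  = \frac{3 k_B T}{\gamma}\,[\Gamma^{-1}]_{ii}
  = \sum_{\alpha:\,\lambda_\alpha \neq 0} [\Delta R_i \cdot \Delta R_i]_\alpha .
```

Each term of the last sum is the contribution of a single mode. Because of the factor $\lambda_\alpha^{-1}$, modes with small eigenvalues (low frequencies) contribute most to the total ms fluctuation of a residue.
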

In our calculation, the cutoff distance $r_c = 7$ Å is used,
as adopted in previous studies [8,9]. The harmonic force
constant $\gamma$ is determined by fitting to the experimental
ms fluctuations. From this model, one can obtain the
fluctuation mode frequencies and eigenvectors for a given
protein. The GNM in general gives results in good
agreement with the observed B-factors [8,9].
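
As an illustration of the procedure described in this section, the following Python sketch builds the Kirchhoff matrix from α-carbon coordinates and evaluates the per-mode ms fluctuations. It is a minimal sketch and not the code used for the results reported here: the function names are hypothetical, NumPy is assumed, pairs exactly at the cutoff are counted as contacts (one possible convention for the step function at zero), the zero mode is dropped, and `scale` stands in for the prefactor that would be fixed by fitting to experimental B-factors.

```python
import numpy as np


def kirchhoff_matrix(coords, cutoff=7.0):
    """Build the N x N Kirchhoff matrix from C-alpha coordinates (angstroms)."""
    coords = np.asarray(coords, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    # Off-diagonal elements: -1 for residue pairs within the cutoff, else 0.
    gamma = -(dist <= cutoff).astype(float)
    np.fill_diagonal(gamma, 0.0)
    # Diagonal elements: coordination number, so that every row sums to zero.
    np.fill_diagonal(gamma, -gamma.sum(axis=1))
    return gamma


def gnm_modes(gamma, tol=1e-8):
    """Return nonzero eigenvalues (ascending) and their eigenvectors as columns."""
    evals, evecs = np.linalg.eigh(gamma)
    keep = evals > tol  # drop the zero mode of a connected network
    return evals[keep], evecs[:, keep]


def mode_ms_fluctuations(evals, evecs, scale=1.0):
    """Per-mode ms fluctuations; entry [i, a] is residue i in mode a.

    `scale` is a placeholder for the temperature/force-constant prefactor.
    """
    return scale * evecs**2 / evals
```

Summing the returned array over modes gives the diagonal of the pseudo-inverse of the Kirchhoff matrix, which is the quantity compared with B-factors after fitting the prefactor.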

The spatial distribution of a given mode is characterized
by its eigenvector. To study the localization properties
of protein fluctuations, we compute the participation
ratio (PR) of each mode, defined by [17]

$$P_{\alpha} = \frac{1}{N} \left( \sum_{i=1}^{N} [u_{\alpha}]_i^{4} \right)^{-1}. \qquad (6)$$

Values of PR range from $1/N$ to unity. PR takes the
value of unity if all residues have equal fluctuations. If
only one residue fluctuates, PR is equal to $1/N$. From its
definition, it is clear that PR is a measure of the degree
of localization. If the PR is small for a given mode, only a
few residues have considerable fluctuations and the mode
is a localized one. On the other hand, if the PR is large
for a given mode, the mode is delocalized.
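
The limiting values quoted above can be checked numerically. The sketch below assumes the PR is the inverse of N times the sum of fourth powers of the normalized eigenvector components, which is the form consistent with the stated limits; the function name is hypothetical and NumPy is assumed.

```python
import numpy as np


def participation_ratio(u):
    """Participation ratio of one mode vector u with N components.

    The vector is normalized first, so unnormalized input is accepted.
    """
    u = np.asarray(u, dtype=float)
    u = u / np.linalg.norm(u)
    return 1.0 / (u.size * np.sum(u**4))


n = 50
uniform = np.ones(n)      # every residue fluctuates equally
single = np.zeros(n)
single[7] = 1.0           # only one residue fluctuates
pair = np.zeros(n)
pair[[3, 4]] = 1.0        # two residues fluctuate equally

pr_uniform = participation_ratio(uniform)  # equals 1
pr_single = participation_ratio(single)    # equals 1/N
pr_pair = participation_ratio(pair)        # equals 2/N
```

More generally, a mode spread evenly over m residues has PR equal to m/N, so N times the PR can be read as an effective number of participating residues.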

It is known that at physiological temperatures a protein
fluctuates among different conformations around its
native one. In principle, therefore, contributions from
all of these conformations should be considered in the
calculation of PR. Unfortunately, only one conformation
can be obtained from experiments. The other conformations
can, however, be generated approximately in the following
way. Each residue is assumed to be able to occupy any
position inside a sphere centered on its experimental
position, with a radius equal to half the magnitude of
its fluctuation. A conformation is then derived by
randomly choosing a position for each residue while
keeping the distance between adjacent residues unchanged,
using the SHAKE algorithm [18].
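
The sampling procedure just described can be sketched as follows. Several points are assumptions made only for illustration: the "magnitude of fluctuation" is taken as the square root of the ms fluctuation, positions are drawn uniformly within each sphere, and the bond-length constraints are restored by a simplified iterative projection in the spirit of SHAKE rather than by the reference algorithm of [18]. The function name and parameters are hypothetical, and NumPy is assumed.

```python
import numpy as np


def sample_conformation(native, ms_fluct, rng, max_iter=500, tol=1e-6):
    """Draw one perturbed conformation around the native C-alpha positions.

    native   : (N, 3) array of native coordinates
    ms_fluct : (N,) array of ms fluctuations, same length units squared
    rng      : numpy.random.Generator
    """
    native = np.asarray(native, dtype=float)
    n = native.shape[0]
    radii = 0.5 * np.sqrt(np.asarray(ms_fluct, dtype=float))

    # Uniform sampling inside a sphere: random direction, radius ~ U^(1/3).
    directions = rng.normal(size=(n, 3))
    directions /= np.linalg.norm(directions, axis=1, keepdims=True)
    r = radii * rng.random(n) ** (1.0 / 3.0)
    coords = native + directions * r[:, None]

    # Native distances between adjacent residues act as constraints.
    bond0 = np.linalg.norm(native[1:] - native[:-1], axis=1)

    # Sweep over bonds, moving both residues symmetrically along the bond
    # until every adjacent distance matches its native value within tol.
    for _ in range(max_iter):
        worst = 0.0
        for i in range(n - 1):
            d = coords[i + 1] - coords[i]
            length = np.linalg.norm(d)
            err = length - bond0[i]
            worst = max(worst, abs(err))
            shift = 0.5 * err * d / length
            coords[i] += shift
            coords[i + 1] -= shift
        if worst < tol:
            break
    return coords
```

Note that the constraint step can move a residue slightly outside its sphere; how this is handled in the actual calculation is not specified in the text. The PR curves would then be averaged over many such conformations.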

The calculated PR for several proteins is shown in Fig.
1. The Brookhaven Protein Data Bank (PDB) codes and
references of the proteins studied are listed in Table I.
The modes are numbered starting from the lowest
frequency. About 100 conformations are used in the
calculations. The curves become smoother as more
conformations are used.

Based on the Anderson localization theory [23,24], Bahar
et al. [9] suggested that modes with higher fluctuation
frequencies would be more localized, which implies a
monotonic decrease in PR with frequency. However, as
suggested by Onuchic et al. [25], proteins are neither
ordered nor random systems, so the localization
properties of protein fluctuations should show intrinsic
features that differ from those of ordered or random
systems.

It can be seen from Fig. 1 that, starting from the lowest
frequency, PR first decreases with frequency, then
increases, and finally decreases again. We have carried
out calculations for a large number of globular proteins
with diverse topologies, secondary-structure arrangements,
and sizes. This behavior of PR seems to be universal,
holding for all globular proteins. Other molecular systems
such as tRNA were also calculated, but their PR behaves
qualitatively differently from that of proteins (data not
shown). It is therefore reasonable to conjecture that the
difference in PR behavior between globular proteins and
other systems reflects an intrinsic difference in certain
properties. Recently, Micheletti et al. [27] studied the
localization properties of HIV-1 protease and found a
similar behavior of PR.

To study the origin of the behavior of PR in globular
proteins, the fluctuation patterns of the protein
myoglobin in different frequency regions are given in
Fig. 2. The frequency regions are labeled by different
letters (see Fig. 1). In the low-frequency region A, the
fluctuations represent a collective motion, characterized
by large values of PR. In region B, the PR is small,
implying localized fluctuations. It is interesting to note
that in this region the fluctuations occur predominantly
at the loops. In the highest-frequency region D, the
fluctuations are confined to the secondary structures,
resulting in small PR. In region C, motions of both loops
and secondary structures are involved. The degree of
localization there is smaller than in regions B and D but
larger than in region A. It can therefore be concluded
from the fluctuation patterns that the dip in PR on the
lower-frequency side (region B) originates from localized
fluctuations at the loops that connect the secondary
structures. Conventional disordered solids and random
coils have almost no well-defined secondary structures
and consequently no loops, so the resulting PR shows a
different behavior. The different behavior of PR in
globular proteins compared with conventional random
solids or coils thus originates from the different nature
of their structures.

To gain deeper insight into how the localization
properties are affected by topology, a lattice model [26]
with variable loop length is adopted. In this model, a
protein is represented by a self-avoiding chain of beads
placed on a two-dimensional discrete lattice. In
constructing the model protein, one must take into
account the fact that secondary structures have a higher
packing density whereas loops have a lower one. A core
region is introduced by bringing two helices into contact
with each other, since cores, with their higher packing
density, are important for stabilizing the whole
structure. Our model protein, shown in Fig. 3(a), consists
of two helices, a connecting loop, and a core. All
residues (beads) are treated identically. In our
calculations only the nearest-neighbor interaction is
considered.
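
For a lattice chain, the Kirchhoff matrix reduces to the graph Laplacian of the nearest-neighbor contact map. The sketch below illustrates this under the assumption that "nearest neighbor" means any two beads one lattice spacing apart, whether or not they are adjacent along the chain. The function name is hypothetical, NumPy is assumed, and the four-bead square used as an example is a toy chain, not the model protein of Fig. 3(a).

```python
import numpy as np


def lattice_kirchhoff(sites):
    """Kirchhoff matrix for beads on a 2D square lattice.

    sites : sequence of (x, y) integer lattice coordinates, in chain order.
    Two beads interact if they are exactly one lattice spacing apart.
    """
    sites = np.asarray(sites, dtype=int)
    if len({tuple(s) for s in sites}) != len(sites):
        raise ValueError("chain is not self-avoiding: a site is occupied twice")
    # On a square lattice, beads one spacing apart have Manhattan distance 1.
    manhattan = np.abs(sites[:, None, :] - sites[None, :, :]).sum(axis=-1)
    gamma = -(manhattan == 1).astype(float)
    # Diagonal: number of contacts of each bead (local packing density).
    np.fill_diagonal(gamma, -gamma.sum(axis=1))
    return gamma


# Toy example: four beads around a unit square; each bead has two contacts.
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
gamma_square = lattice_kirchhoff(square)
eigenvalues = np.linalg.eigvalsh(gamma_square)
```

Beads in compact regions have many contacts and hence large diagonal elements, while beads in an extended loop have only their two chain neighbors, which is how the model encodes the difference in packing density between secondary structures and loops.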

An advantage of the lattice model is that the structure
can be changed as desired to see how the residue
fluctuations are affected by structural changes, which is
difficult to do in real proteins. Figure 3(b) shows the PR
calculated with the GNM for the model protein with
different loop lengths. The loop length is changed by
moving the loop horizontally to the left or right. The
curves are smoothed by 10-point adjacent averaging. The PR
of fluctuations in the simple model protein clearly shows
a behavior similar to that of real globular proteins. With
increasing loop length, the PR values of both the dip
(region B in Fig. 1) and the peak (region C in Fig. 1)
decrease. It can be seen from Fig. 3(c) that at the dip
the fluctuations are dominant in the loop region. Again,
the dip originates from the loop. For the highest-frequency
mode, the fluctuations occur predominantly at the helices,
especially in the core region. The broad peak comprises
modes that are more delocalized and less well defined.
These modes are associated with coupled motions among
secondary structures.
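
The 10-point adjacent averaging mentioned above amounts to a moving average over neighboring modes. A minimal sketch is given below; the function name is hypothetical, NumPy is assumed, and the treatment of the ends of the curve (here, only full windows are kept) is an assumption, since it is not specified in the text.

```python
import numpy as np


def adjacent_average(y, window=10):
    """Smooth a curve by averaging each run of `window` adjacent points.

    Only full windows are used, so the output has len(y) - window + 1 points.
    """
    y = np.asarray(y, dtype=float)
    kernel = np.ones(window) / window
    return np.convolve(y, kernel, mode="valid")


# Example on a linear ramp: each output point is the mean of 10 neighbors.
ramp = np.arange(20.0)
smoothed = adjacent_average(ramp, window=10)
```

Applied to a PR curve ordered by mode number, this suppresses mode-to-mode scatter while preserving features, such as the dip and the broad peak, that extend over many modes.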

In summary, localization properties of fluctuations in 
globular proteins were studied by using the Gaussian net- 
work model. It was found that the participation ratio of 
fluctuations in globular proteins shows a universal be- 
havior, confirmed by theoretical calculations in both real 
globular and model proteins. The loops connecting the 
secondary structures are responsible for this feature. 

FIG. 1. Calculated participation ratio of residue fluctuations
for the proteins listed in Table I.

Figure 2. Y. Wu et al.




TABLE I. The PDB codes and references of the proteins studied
in the present work.

Protein        PDB code    Reference
Myoglobin      1bvc        [19]
Lysozyme       166l        [20]
Hydrolase      1amp        [21]
Thermolysin    5tln        [22]

FIG. 2. Calculated projected residue ms fluctuations in different
frequency regions for myoglobin. The secondary structures of
myoglobin are indicated by the heavy horizontal line segments
at the top of the figure. The remaining regions are loops.




FIG. 3. (a) Lattice model protein consisting of two helices, a
loop, and a core region. The loop length can be changed by
moving the loop horizontally to the right. (b) Participation
ratio versus frequency number for model proteins with different
loop lengths. The solid line is for the model protein shown in
(a), with a loop length of 13a, where a is the lattice constant.
The dotted and dashed lines are for model proteins with loop
lengths of 23a and 33a, respectively. (c) Projected residue ms
fluctuations (arb. units) versus residue number for the modes
with the smallest PR in the dip region (dashed line) and with
the largest frequency (solid line).